Cleansing Wikipedia Categories using Centrality
Abstract
We propose a novel, general technique for pruning and cleansing the Wikipedia category hierarchy, with a tunable level of aggregation. Our approach is endogenous: it does not use any information from Wikipedia articles and relies solely on the user-generated (noisy) Wikipedia category folksonomy itself. We show how the proposed techniques can help reduce the level of noise in the hierarchy and discuss how alternative centrality measures can affect the result differently.
Related resources
Topic Identification Using Wikipedia Graph Centrality
This paper presents a method for automatic topic identification using a graph-centrality algorithm applied to an encyclopedic graph derived from Wikipedia. When tested on a data set with manually assigned topics, the system is found to significantly improve over a simpler baseline that does not make use of the external encyclopedic knowledge.
Assessing the Quality of Wikipedia Pages Using Edit Longevity and Contributor Centrality
In this paper we address the challenge of assessing the quality of Wikipedia pages using scores derived from edit contribution and contributor authoritativeness measures. The hypothesis is that pages with significant contributions from authoritative contributors are likely to be high-quality pages. Contributions are quantified using edit longevity measures and contributor authoritativeness is s...
Customer Knowledge and Service Development, the Web 2.0 Role in Co-production
The paper is concerned with relationships between SSME and ICTs and focuses on the role of Web 2.0 tools in the service development process. The research presented aims at exploring how collaborative technologies can support and improve service processes, highlighting customer centrality and value coproduction. The core idea of the paper is the centrality of user participation and the collabora...
The web mirrors value in the real world: comparing a firm's valuation with its web network position
This paper compares a firm’s innovation and performance with its online Web presence measured through the Web network structure. 489 firms in five different industries listed on the United States and Chinese stock markets are investigated. Using Web link data collected from Bing, blogs, Twitter and Wikipedia, we find positive correlation between betweenness centrality of a firm in the Web netwo...
Evaluating authoritative sources using social networks: an insight from Wikipedia
Purpose – The purpose of this paper is to present an approach to evaluating contributions in collaborative authoring environments, and in particular wikis, using social network measures. Design/methodology/approach – A social network model for Wikipedia has been constructed and metrics of importance such as centrality have been defined. Data have been gathered from articles belonging to the s...